Use this template to complete your project throughout the course. Your Final Project presentation will be based on the contents of this document. Replace the title/name above and text below with your own, but keep the headers.

Overview

Give a brief a description of your project and its goal(s), what data you are using to complete it, and what three faculty/staff in different fields you have spoken to about your project with a brief summary of what you learned from each person. Include a link to your final project GitHub repository.

SARS-CoV-2 was first discovered in Wuhan, China later in the year of 2019, and the virus has spread rapidly to cause an international pandemic. Among the many countries impacted by COVID-19, the U.S. seems to be on top of the list in regards to nations who are the most detrimentally effected by the introduction of the virus to its population. As of December 1st, 2020, there have been more than fifteen million cases and over 250,000 associated deaths within the U.S. alone. Among the various factors of virulent spread of the disease, socioeconomic gradients of health play a vital role in disparities seen within the United States of America. As a result, such a topic needs to be explored further to lessen health disparities within our society by allocating resources for prevention and treatment of COVID-19 as well as other related illnesses.

The goal of the project is to: Examine socioeconomic and demographic factors that contribute to health disparities Observe the consequences of various demographic factors through different outcomes *Highlight how segments of the population might be marginalized in terms of Care

In order to explore such a domain, a variety of resources had to be used such as datasets from the Coronavirus Package, The New York Times COVID-19 dataset, and the CDC COVID-19 dataset. With the COVID-19 pandemic still prevalent in everyone’s lives, it is essential that any analysis that is implemented is done with the most up to data information. With this characteristics in mind, the report uses datasets that continuously update everydat. The first faculty member that I met with was Dr. Feng Rui within the Department of Biostatistics. After meeting with her, I learned about the datasets that she and her team uses to complete modeling of COVID-19 and associated projections which I used for the report. The second advisor to this project is Dr. Chen Yong who is also part of the Department of Biostatistics. Although Dr. Chong is deeply involved in machine learning processes within Penn’s campus, he briefly advised me to depart from my initial idea of modeling Covid-19 data and instead helped me with various statistical analysis within the results section of the report. Finally, the last advisor to the report is Dr. David Rubin of CHOP. As an individual who has previously completed interactive maps and COVID-19 modeling, Dr. Rubin was an excellent resource who provided instruction on completing COVID-19 interactive maps in relation to cases, deaths, and cases per capita.

Github Repository Link: https://github.com/Aaronc123814/BMIN503_Final_Project.git

Introduction

Describe the problem addressed, its significance, and some background to motivate the problem.

Explain why your problem is interdisciplinary, what fields can contribute to its understanding, and incorporate background related to what you learned from meeting with faculty/staff.

Good health is a vital pillar to live a satisfactory life, and it should be univerally attainable. Still, it is observed from the literature and the experiences of minorities, that the life experiences of marginalized populations greatly deviate from the life experiences of normal Americans. Although there are a variety of factors that contribute to such observations, SES and other demographic factors can provide a quantitative measurement that allows individuals to compare particular factors among cohorts and groups. For example, it is widely known that the median income, education status, and race are particular variables that can provide insight to how disparities exist within our society. Though the relationship between relevant demographic factors and health, is not unique to COVID-19 alone, this project seeks to present how such disparities become more evident as a result of the pandemic. In order to accomplish this project, it should be noted that a myriad of domains can help with its understanding such as biostatistics, computer science, medicine, and public health. As a result, the project seeks to educate its audience regarding the propensity of such a deeply entrenched problem for future solutions in relation to public policy implementation and systemic societal changes.

Methods

Describe the data used and general methodological approach. Subsequently, incorporate full R code necessary to retrieve and clean data, and perform analysis. Be sure to include a description of code so that others (including your future self) can understand what you are doing and why.

#Install Necessary Packages

install.packages("coronavirus")
#install.packages("devtools")
devtools::install_github("RamiKrispin/coronavirus")
install.packages("coronavirus")
install.packages("devtools")
devtools::install_github("RamiKrispin/coronavirus")
install.packages("EpiModel")
install.packages("gt")
install.packages("DiagrammeR")
install.packages("tictoc")
install.packages("incidence")
install.packages("earlyR")
install.packages("rnaturalearth")
install.packages("readxl")
#Load Necessary Packages

library(dplyr) 
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.4     ✓ stringr 1.4.0
## ✓ tidyr   1.1.2     ✓ forcats 0.5.0
## ✓ readr   1.4.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggplot2) 
library(coronavirus) 
library(devtools) 
## Loading required package: usethis
library(sf) 
## Linking to GEOS 3.8.1, GDAL 3.1.1, PROJ 6.3.1
covid19_df <- refresh_coronavirus_jhu() 
library(RColorBrewer)
library(leaflet)
library(rgdal)
## Loading required package: sp
## rgdal: version: 1.5-18, (SVN revision 1082)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.1.1, released 2020/06/22
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rgdal/gdal
## GDAL binary built with GEOS: TRUE 
## Loaded PROJ runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION: 631]
## Path to PROJ shared files: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rgdal/proj
## Linking to sp version:1.4-4
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
library(readxl)
library(cowplot)

##Information Retrieval

#Information Retrieval
#Load in CDC, The New York Times, and other relevant data

us_counties <- readRDS(gzcon(url("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/uscounties_2010.rds")))

times_counties_data <- read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv") #The New York Times Data

times_state_data <- read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv") #The New York Times Data

per_capita_data <- read.csv("co-est2019-alldata.csv", header = TRUE) #ResearchLabData

covid_variables <- read.csv("Provisional_COVID-19_Death_Counts_by_Sex__Age__and_State.csv", header = TRUE) #CDC Data

covid_variables_race <- read.csv("Provisional_Death_Counts_for_Coronavirus_Disease__COVID-19___Distribution_of_Deaths_by_Race_and_Hispanic_Origin.csv", header = TRUE) #CDC Data

race <- read.csv("data-BYeoA.csv", header = TRUE) #ResearchLabData

##Global Cases Interactive Map

#Global Cases Interactive Map df

dir.create("Corona_Cases_Global") #creating directory for shp
## Warning in dir.create("Corona_Cases_Global"): 'Corona_Cases_Global' already exists
setwd("Corona_Cases_Global")      #arrive backt to Corona_Cases_Global directory
download.file("http://thematicmapping.org/downloads/TM_WORLD_BORDERS_SIMPL-0.3.zip", "Corona_Cases_Global.zip")        #download shapefile from website
unzip("Corona_Cases_Global.zip")  #unzip file
if (file.exists("Corona_Cases_Global.zip")) file.remove("Corona_Cases_Global.zip")
## [1] TRUE
setwd("..")                       #proceed back to parent file

spatial_polygon_df <- readOGR( 
  dsn= paste0(getwd(),"/Corona_Cases_Global/globe_shape_file/") , 
  layer="TM_WORLD_BORDERS_SIMPL-0.3",   #load shapefile as sp object
  verbose=FALSE)

global_cases <- st_as_sf(spatial_polygon_df) %>%  #select relevant data from df
  select(LON, LAT, NAME, FIPS)


global_cases <- dplyr::rename(global_cases, country = NAME) #create understandable identifiers

country_columns <- covid19_df %>% #select relevant data for future inner join
  select(location, location_code) %>%
  unique()

country_columns <- dplyr::rename(country_columns, country = location) #create understandable identifiers

summary_df <- coronavirus %>% #creating new df with specific attributes
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases)) 
## `summarise()` ungrouping output (override with `.groups` argument)
revised_corona_df <- inner_join(country_columns, summary_df, by="country") #new df with data

revised_corona_df <- dplyr::rename(revised_corona_df, FIPS = location_code) #for future inner join

Global_Cases_map <- inner_join(revised_corona_df, global_cases, by="FIPS") #inner join 

##U.S. COVID-19 Cases over Time

#df for U.S. Cases in relation to time 

summary_df <- coronavirus %>% #creating new df with specific attributes
  filter(type == "confirmed") %>% #filtering data type for confirmed cases
  filter(country == "US") #location needs to be within US

##U.S. County Cases Total

#Created df for COVID-19 County Cases in relation to Location

COVID_Cases_State <- times_counties_data %>%  #create new df 
  select(county, fips, cases, state) %>%      #select relevant data
  group_by(county, fips, state) %>%           #group by specific columns
  summarise(total_cases = sum(cases))         #calculate total cases per county
## `summarise()` regrouping output by 'county', 'fips' (override with `.groups` argument)
COVID_Cases_State$county <- paste(COVID_Cases_State$county, "County") #add "county" to column

us_counties$county <- paste(us_counties$NAME, "County") #add "county" to column

us_counties <- us_counties %>%  #simplify df
  select(-NAME)

Cases <- inner_join(COVID_Cases_State, us_counties, by="county") #create new df 

Cases <- Cases %>%
  select(county, state, GEO_ID, CENSUSAREA, geometry, total_cases, fips) #simplify df

##U.S. County Deaths Total

#Created df for COVID-19 County Deaths in relation to Location

COVID_Deaths_State <- times_counties_data %>% #create new df
  select(county, fips, deaths, state) %>%     #select relevant data
  group_by(county, fips, state) %>%           #group by specific columns
  summarise(total_deaths = sum(deaths))       #calculate total cases per county
## `summarise()` regrouping output by 'county', 'fips' (override with `.groups` argument)
COVID_Deaths_State$county <- paste(COVID_Deaths_State$county, "County") #add "county" to column

Deaths <-  inner_join(COVID_Deaths_State, us_counties, by="county") #create new df 
Deaths <- Deaths %>%
  select(county, state, GEO_ID, CENSUSAREA, geometry, total_deaths, fips) #select relevant data from df

##U.S. County Cases Per Capita

#df for COVID-19 Cases per Capita

per_capita_data <- per_capita_data %>% #create new df
  select(CTYNAME, POPESTIMATE2019)     #select relevant data

per_capita_data <- dplyr::rename(per_capita_data, county = CTYNAME) #rename column for innerjoin

per_capita_data <-inner_join(per_capita_data, Cases, by="county")   #add "county" to column

per_capita <- per_capita_data$total_cases/per_capita_data$POPESTIMATE2019 #calculate Cases per capita

per_capita_data <- per_capita_data %>%  #create new df
  mutate(per_capita)                    #create new column within df

COVID-19 Deaths by Gender and Age Cohorts

#df for COVID-19 Deaths in relation to Gender and Age

covid_variables <- dplyr::rename(covid_variables, Age_group = Age.group, COVID_Deaths = COVID.19.Deaths) #rename columns with appropriate titles

covid_sex_age_deaths <- covid_variables %>%  #create new df
  select(Sex, Age_group, COVID_Deaths ) %>%  #select relevant data
  filter(Sex == "Female" | Sex == "Male") %>% #filter for specific genders
  filter(Age_group != "All Ages" ) %>% #exclude variable within column
  na.omit() #remove any NAs

covid_sex_age_deaths$Sex <- as.factor(covid_sex_age_deaths$Sex) #convert sex into factor

##COVID-19 Deaths by Gender

# df for Total COVID-19 Deaths in relation to Gender 

gender_graphs <- covid_variables %>%
  filter(Sex == "Male" | Sex == "Female") %>%
  filter(Age_group == "All Ages") %>%
  filter(State == "United States") %>%
  select(Sex, COVID_Deaths)

##COVID-19 Deaths by Race

#df for COVID-19 Deaths with Respect to Race

race <- dplyr::rename(race, Race = X.1, Actual_Deaths = Actual, Age_Adjusted = "Age.adjusted")

##Food Insufficiency by Race

#df for Food Insufficiency with Respect to Race

food_insufficiency_race <- read_excel("food3b_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
food_insufficiency_race <- dplyr::rename(food_insufficiency_race,Total = ...2, Enough_of_the_types_of_food_wanted = ...3, Enough_food_but_not_always_the_types_wanted = ...4, Sometimes_not_enough_to_eat = ...5, Often_not_enough_to_eat = ...6, Race = "Food Table 3b. Food Sufficiency for Households with Children, in the Last 7 Days, by Select Characteristics: United States")

food_insufficiency_race <- food_insufficiency_race[18:22, ] %>%
  select(-...7) 

food_insufficiency_race$Total <- as.numeric(food_insufficiency_race$Total)

food_insufficiency_race$Enough_of_the_types_of_food_wanted <- as.numeric(food_insufficiency_race$Enough_of_the_types_of_food_wanted)

food_insufficiency_race$Enough_food_but_not_always_the_types_wanted <- as.numeric(food_insufficiency_race$Enough_food_but_not_always_the_types_wanted)

food_insufficiency_race$Sometimes_not_enough_to_eat <- as.numeric(food_insufficiency_race$Sometimes_not_enough_to_eat)

food_insufficiency_race$Often_not_enough_to_eat <- as.numeric(food_insufficiency_race$Often_not_enough_to_eat)

difficulty_eating <- food_insufficiency_race$Sometimes_not_enough_to_eat + food_insufficiency_race$Often_not_enough_to_eat

percent_difficulty_eating <- (difficulty_eating/food_insufficiency_race$Total) * 100

food_insufficiency_race <- food_insufficiency_race %>%
  mutate(percent_difficulty_eating)

##Food Insufficiency by Household Income

#df for Food Insufficiency with Respect to Household Income

food_insufficiency_income <- read_excel("food3b_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
food_insufficiency_income  <- dplyr::rename(food_insufficiency_income,Total = ...2, Enough_of_the_types_of_food_wanted = ...3, Enough_food_but_not_always_the_types_wanted = ...4, Sometimes_not_enough_to_eat = ...5, Often_not_enough_to_eat = ...6, Household_Income = "Food Table 3b. Food Sufficiency for Households with Children, in the Last 7 Days, by Select Characteristics: United States")

food_insufficiency_income  <- food_insufficiency_income [72:79, ] %>%
  select(-...7) 

food_insufficiency_income $Total <- as.numeric(food_insufficiency_income $Total)

food_insufficiency_income$Enough_of_the_types_of_food_wanted <- as.numeric(food_insufficiency_income $Enough_of_the_types_of_food_wanted)

food_insufficiency_income$Enough_food_but_not_always_the_types_wanted <- as.numeric(food_insufficiency_income $Enough_food_but_not_always_the_types_wanted)

food_insufficiency_income $Sometimes_not_enough_to_eat <- as.numeric(food_insufficiency_income $Sometimes_not_enough_to_eat)

food_insufficiency_income$Often_not_enough_to_eat <- as.numeric(food_insufficiency_income $Often_not_enough_to_eat)

difficulty_eating <- food_insufficiency_income$Sometimes_not_enough_to_eat + food_insufficiency_income$Often_not_enough_to_eat

percent_difficulty_eating <- (difficulty_eating/food_insufficiency_income$Total) * 100

food_insufficiency_income <- food_insufficiency_income %>%
  mutate(percent_difficulty_eating)

##Diffculty Paying Rent by Race

#df for Difficulty Paying Rent in Relation to Race

rent_race <- read_excel("housing1b_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
rent_race  <- dplyr::rename(rent_race,Total = ...2, Occupied_Without_Rent = ...3, Household_Currently_Caught_Up_On_Rent_Payments = ...4, Household_Currently_Not_Caught_Up_On_Rent_Payments = ...5, Race = "Housing Table 1b. Last Month’s Payment Status for Renter-Occupied Housing Units, by Select Characteristics: United States")

rent_race <- rent_race [18:22, ] %>%
  select(-...7, -...6) 

rent_race$Total <- as.numeric(rent_race$Total)

rent_race$Occupied_Without_Rent <- as.numeric(rent_race$Occupied_Without_Rent)

rent_race$Household_Currently_Caught_Up_On_Rent_Payments <- as.numeric(rent_race $Household_Currently_Caught_Up_On_Rent_Payments)

rent_race$Household_Currently_Not_Caught_Up_On_Rent_Payments <- as.numeric(rent_race$Household_Currently_Not_Caught_Up_On_Rent_Payments)

percent_difficulty_rent <- (rent_race$Household_Currently_Not_Caught_Up_On_Rent_Payments/rent_race$Total) * 100

rent_race <- rent_race %>%
  mutate(percent_difficulty_rent)

##Diffculty Paying Rent by Household Income

# df for Difficulty Paying Rent in Relation to Household Income

rent_income <- read_excel("housing1b_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
rent_income <- dplyr::rename(rent_income,Total = ...2, Occupied_Without_Rent = ...3, Household_Currently_Caught_Up_On_Rent_Payments = ...4, Household_Currently_Not_Caught_Up_On_Rent_Payments = ...5, Household_Income = "Housing Table 1b. Last Month’s Payment Status for Renter-Occupied Housing Units, by Select Characteristics: United States")

rent_income <- rent_income [54:61, ] %>%
  select(-...7, -...6) 

rent_income$Total <- as.numeric(rent_income$Total)

rent_income$Occupied_Without_Rent <- as.numeric(rent_income$Occupied_Without_Rent)

rent_income$Household_Currently_Caught_Up_On_Rent_Payments <- as.numeric(rent_income $Household_Currently_Caught_Up_On_Rent_Payments)

rent_income$Household_Currently_Not_Caught_Up_On_Rent_Payments <- as.numeric(rent_income$Household_Currently_Not_Caught_Up_On_Rent_Payments)


percent_difficulty_rent <- (rent_income$Household_Currently_Not_Caught_Up_On_Rent_Payments/rent_race$Total) * 100
## Warning in rent_income$Household_Currently_Not_Caught_Up_On_Rent_Payments/rent_race$Total: longer object length is not a multiple of shorter object length
rent_income <- rent_income %>%
  mutate(percent_difficulty_rent)

##Diffculty Paying Rent by Education Status

#df for Difficulty Paying Rent in Relation to Education Status

rent_education <- read_excel("housing1b_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
rent_education <- dplyr::rename(rent_education,Total = ...2, Occupied_Without_Rent = ...3, Household_Currently_Caught_Up_On_Rent_Payments = ...4, Household_Currently_Not_Caught_Up_On_Rent_Payments = ...5, Education_Status = "Housing Table 1b. Last Month’s Payment Status for Renter-Occupied Housing Units, by Select Characteristics: United States")

rent_education <- rent_education [24:27, ] %>%
  select(-...7, -...6) 

rent_education$Total <- as.numeric(rent_education$Total)

rent_education$Occupied_Without_Rent <- as.numeric(rent_education$Occupied_Without_Rent)

rent_education$Household_Currently_Caught_Up_On_Rent_Payments <- as.numeric(rent_education $Household_Currently_Caught_Up_On_Rent_Payments)

rent_education$Household_Currently_Not_Caught_Up_On_Rent_Payments <- as.numeric(rent_education$Household_Currently_Not_Caught_Up_On_Rent_Payments)

percent_difficulty_rent <- (rent_education$Household_Currently_Not_Caught_Up_On_Rent_Payments/rent_education$Total) * 100

rent_education <- rent_education %>%
  mutate(percent_difficulty_rent)

##Diffculty Paying Expenses by Race

# df for Difficulty Paying Expenses in Relation to Race

expenses_race <- read_excel("spending1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
expenses_race <- dplyr::rename(expenses_race,Total = ...2, Not_at_all_difficult = ...3, A_little_difficult = ...4, Somewhat_difficult = ...5, Very_difficult = ...6, Race = "Household Spending Table 1. Difficulty Paying Usual Household Expenses in the Last 7 Days, by Select Characteristics: United States")

expenses_race <- expenses_race [18:22, ] %>%
  select(-...7) 

expenses_race$Total <- as.numeric(expenses_race$Total)

expenses_race$Not_at_all_difficult <- as.numeric(expenses_race$Not_at_all_difficult)

expenses_race$A_little_difficult <- as.numeric(expenses_race$A_little_difficult)

expenses_race$Somewhat_difficult <- as.numeric(expenses_race$Somewhat_difficult)

expenses_race$Very_difficult <- as.numeric(expenses_race$Very_difficult)

difficulty_expenses <- expenses_race$Somewhat_difficult + expenses_race$Very_difficult

percent_difficulty_expenses <- (difficulty_expenses/expenses_race$Total) * 100

expenses_race <- expenses_race %>%
  mutate(percent_difficulty_expenses)

##Diffculty Paying Expenses by Household Income

#df for Difficulty Paying Expenses in Relation to Household Income

expenses_income <- read_excel("spending1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
expenses_income <- dplyr::rename(expenses_income,Total = ...2, Not_at_all_difficult = ...3, A_little_difficult = ...4, Somewhat_difficult = ...5, Very_difficult = ...6, Household_Income = "Household Spending Table 1. Difficulty Paying Usual Household Expenses in the Last 7 Days, by Select Characteristics: United States")

expenses_income <- expenses_income[53:60, ] %>%
  select(-...7) 

expenses_income$Total <- as.numeric(expenses_income$Total)

expenses_income$Not_at_all_difficult <- as.numeric(expenses_income$Not_at_all_difficult)

expenses_income$A_little_difficult <- as.numeric(expenses_income$A_little_difficult)

expenses_income$Somewhat_difficult <- as.numeric(expenses_income$Somewhat_difficult)

expenses_income$Very_difficult <- as.numeric(expenses_income$Very_difficult)

difficulty_expenses <- expenses_income$Somewhat_difficult + expenses_income$Very_difficult

percent_difficulty_expenses <- (difficulty_expenses/expenses_income$Total) * 100

expenses_income <- expenses_income %>%
  mutate(percent_difficulty_expenses)

##Diffculty Paying Expenses by Education Status

#df for Difficulty Paying Expenses in Relation to Education Status

expenses_education <- read_excel("spending1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
expenses_education <- dplyr::rename(expenses_education,Total = ...2, Not_at_all_difficult = ...3, A_little_difficult = ...4, Somewhat_difficult = ...5, Very_difficult = ...6, Education_Status = "Household Spending Table 1. Difficulty Paying Usual Household Expenses in the Last 7 Days, by Select Characteristics: United States")

expenses_education <- expenses_education[24:27, ] %>%
  select(-...7) 

expenses_education$Total <- as.numeric(expenses_education$Total)

expenses_education$Not_at_all_difficult <- as.numeric(expenses_education$Not_at_all_difficult)

expenses_education$A_little_difficult <- as.numeric(expenses_education$A_little_difficult)

expenses_education$Somewhat_difficult <- as.numeric(expenses_education$Somewhat_difficult)

expenses_education$Very_difficult <- as.numeric(expenses_education$Very_difficult)

difficulty_expenses <- expenses_education$Somewhat_difficult + expenses_education$Very_difficult

percent_difficulty_expenses <- (difficulty_expenses/expenses_education$Total) * 100

expenses_education <- expenses_education %>%
  mutate(percent_difficulty_expenses)

##Unemployment by Household Income

#df for Unemployment with Respect to Household Income

employment_income <- read_excel("employ1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
employment_income <- dplyr::rename(employment_income,Total = ...2, Did_Not_Experience_Loss_of_Employment_since_March = ...3, Did_Experience_Loss_of_Employment_since_March = ...4, Household_Income = "Employment Table 1. Experienced and Expected Loss of Employment Income, by Select Characteristics: United States")

employment_income <- employment_income[53:60, ] %>%
  select(-...7, -...6, -...5, -...8) 

employment_income$Total <- as.numeric(employment_income$Total)

employment_income$Did_Experience_Loss_of_Employment_since_March <- as.numeric(employment_income$Did_Experience_Loss_of_Employment_since_March)

employment_income$Did_Not_Experience_Loss_of_Employment_since_March <- as.numeric(employment_income$Did_Not_Experience_Loss_of_Employment_since_March)

Percent_Unemployment <- (employment_income$Did_Experience_Loss_of_Employment_since_March/employment_income$Total) * 100

employment_income <- employment_income %>%
  mutate(Percent_Unemployment)

##Unemployment by Education Status

# df for Unemployment with Respect to Education Status

employment_education <- read_excel("employ1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
employment_education <- dplyr::rename(employment_education,Total = ...2, Did_Not_Experience_Loss_of_Employment_since_March = ...3, Did_Experience_Loss_of_Employment_since_March = ...4, Education_Status = "Employment Table 1. Experienced and Expected Loss of Employment Income, by Select Characteristics: United States")

employment_education <- employment_education[24:27, ] %>%
  select(-...7, -...6, -...5, -...8) 

employment_education$Total <- as.numeric(employment_education$Total)

employment_education$Did_Experience_Loss_of_Employment_since_March <- as.numeric(employment_education$Did_Experience_Loss_of_Employment_since_March)

employment_education$Did_Not_Experience_Loss_of_Employment_since_March <- as.numeric(employment_education$Did_Not_Experience_Loss_of_Employment_since_March)

Percent_Unemployment <- (employment_education$Did_Experience_Loss_of_Employment_since_March/employment_education$Total) * 100

employment_education <- employment_education %>%
  mutate(Percent_Unemployment)

##Unemployment with Respect to Health Status

#df for Unemployment with Respect to Health Status

employment_health <- read_excel("employ2_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
employment_health <- dplyr::rename(employment_health,Total = ...2, Government = ...3, Private_Company = ...4, Nonprofit_Organization = ...5, Self_Employed = ...6, Family_Business = ...7, Health_Status = "Employment Table 2. Employment Status and Sector of Employment, by Select Characteristics: United States")

employment_health <- employment_health[47:51, ] %>%
  select(-...8, -...9, -...10) 

employment_health$Total <- as.numeric(employment_health$Total)

employment_health$Government <- as.numeric(employment_health$Government)

employment_health$Private_Company <- as.numeric(employment_health$Private_Company)

employment_health$Nonprofit_Organization <- as.numeric(employment_health$Nonprofit_Organization)

employment_health$Nonprofit_Organization <- as.numeric(employment_health$Nonprofit_Organization)

employment_health$Self_Employed <- as.numeric(employment_health$Self_Employed)

employment_health$Family_Business <- as.numeric(employment_health$Family_Business)

gov_sum <- sum(employment_health$Government)

PC_sum <- sum(employment_health$Private_Company)

NO_sum <- sum(employment_health$Nonprofit_Organization)

SE_sum <- sum(employment_health$Self_Employed)

FB_sum <- sum(employment_health$Family_Business)

employment_health$Government <- (employment_health$Government/gov_sum) * 100

employment_health$Private_Company <- (employment_health$Private_Company/PC_sum) * 100

employment_health$Nonprofit_Organization <- (employment_health$Nonprofit_Organization/NO_sum) * 100

employment_health$Self_Employed <- (employment_health$Self_Employed/SE_sum) * 100

employment_health$Family_Business <- (employment_health$Family_Business/FB_sum) * 100

employment_health <- pivot_longer(employment_health, c(Government, Private_Company, Nonprofit_Organization, Self_Employed, Family_Business), "Occupation")

##Access to Medical Care with Respect to Race

#Access to Medical Care with Respect to Race

access_race <- read_excel("health1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
access_race <- dplyr::rename(access_race, Difficult = ...2, No_Difficulty = ...3, Race = "Health Table 1. Coronavirus Pandemic Related Problems with Access to Medical Care, in Last 4 weeks, by Select Characteristics: United States")

    
access_race <- access_race[21:25, ] %>%
  select(-...7, -...6, -...5, -...4) 

access_race <- pivot_longer(access_race, c(Difficult, No_Difficulty), "Access_to_Care")

##Access to Medical Care with Respect to Household Income

#df for Access to Medical Care with Respect to Household Income

access_income <- read_excel("health1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
access_income <- dplyr::rename(access_income, Difficult = ...2, No_Difficulty = ...3, Household_Income = "Health Table 1. Coronavirus Pandemic Related Problems with Access to Medical Care, in Last 4 weeks, by Select Characteristics: United States")

    
access_income <- access_income[57:64, ] %>%
  select(-...7, -...6, -...5, -...4) 

access_income <- pivot_longer(access_income, c(Difficult, No_Difficulty), "Access_to_Care")

##Access to Medical Care with Respect to Education Status

# df for Access to Medical Care with Respect to Education Status

access_educ <- read_excel("health1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
access_educ <- dplyr::rename(access_educ, Difficult = ...2, No_Difficulty = ...3, Education_Status = "Health Table 1. Coronavirus Pandemic Related Problems with Access to Medical Care, in Last 4 weeks, by Select Characteristics: United States")

    
access_educ <- access_educ[27:30, ] %>%
  select(-...7, -...6, -...5, -...4) 

access_educ <- pivot_longer(access_educ, c(Difficult, No_Difficulty), "Access_to_Care")

##Access to Medical Care with Respect to Age Cohorts

#df for Access to Medical Care with Respect to Age Cohorts

access_age <- read_excel("health1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
## * ...
access_age <- dplyr::rename(access_age, Difficult = ...2, No_Difficulty = ...3, Age_Group = "Health Table 1. Coronavirus Pandemic Related Problems with Access to Medical Care, in Last 4 weeks, by Select Characteristics: United States")

    
access_age <- access_age[10:16, ] %>%
  select(-...7, -...6, -...5, -...4) 

access_age <- pivot_longer(access_age, c(Difficult, No_Difficulty), "Access_to_Care")

##Insurance Status with Respect to Race

#df for Insurance Status with Respect to Race

insurance_race <- read_excel("health3_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
insurance_race <- dplyr::rename(insurance_race, Insured_Total = ...2, Private_Insured = ...3, Public_Insured = ...4, Uninsured = ...5, Race = "Health Table 3. Current Health Insurance Status, by Select Characteristics: United States")

insurance_race <- insurance_race[18:22, ] %>%
  select( -...6) 

insurance_race$Insured_Total <- as.numeric(insurance_race$Insured_Total)

insurance_race$Private_Insured <- as.numeric(insurance_race$Private_Insured)

insurance_race$Public_Insured <- as.numeric(insurance_race$Public_Insured)

insurance_race$Uninsured <- as.numeric(insurance_race$Uninsured)

Total_Pop <- cbind(insurance_race, Total = rowSums(insurance_race[, 3:5]))

insurance_race <- insurance_race %>%
  mutate(Total_Pop)

insurance_race$Uninsured <- (insurance_race$Uninsured/insurance_race$Total) * 100

insurance_race$Private_Insured <- (insurance_race$Private_Insured/insurance_race$Total) * 100

insurance_race$Public_Insured <- (insurance_race$Public_Insured/insurance_race$Total) * 100

insurance_race <- insurance_race %>%
  select(-Insured_Total, -Total)

insurance_race <- pivot_longer(insurance_race, c(Uninsured, Private_Insured, Public_Insured ), "Insurance_Status")

##Insurance Status with Respect to Household Income

#df for Insurance Status with Respect to Household Income

insurance_income <- read_excel("health3_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
insurance_income <- dplyr::rename(insurance_income, Insured_Total = ...2, Private_Insured = ...3, Public_Insured = ...4, Uninsured = ...5, Household_Income = "Health Table 3. Current Health Insurance Status, by Select Characteristics: United States")

insurance_income <- insurance_income[54:61, ] %>%
  select( -...6) 

insurance_income$Insured_Total <- as.numeric(insurance_income$Insured_Total)

insurance_income$Private_Insured <- as.numeric(insurance_income$Private_Insured)

insurance_income$Public_Insured <- as.numeric(insurance_income$Public_Insured)

insurance_income$Uninsured <- as.numeric(insurance_income$Uninsured)

Total_Pop <- cbind(insurance_income, Total = rowSums(insurance_income[, 3:5]))

insurance_income <- insurance_income %>%
  mutate(Total_Pop)

insurance_income$Uninsured <- (insurance_income$Uninsured/insurance_income$Total) * 100

insurance_income$Private_Insured <- (insurance_income$Private_Insured/insurance_income$Total) * 100

insurance_income$Public_Insured <- (insurance_income$Public_Insured/insurance_income$Total) * 100

insurance_income <- insurance_income %>%
  select(-Insured_Total, -Total)

insurance_income <- pivot_longer(insurance_income, c(Uninsured, Private_Insured, Public_Insured ), "Insurance_Status")

##Insurance Status with Respect to Education Status

#df for Insurance Status with Respect to Education Status

insurance_educ <- read_excel("health3_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * `` -> ...6
insurance_educ <- dplyr::rename(insurance_educ, Insured_Total = ...2, Private_Insured = ...3, Public_Insured = ...4, Uninsured = ...5, Education_Status = "Health Table 3. Current Health Insurance Status, by Select Characteristics: United States")

insurance_educ <- insurance_educ[24:27, ] %>%
  select( -...6) 

insurance_educ$Insured_Total <- as.numeric(insurance_educ$Insured_Total)

insurance_educ$Private_Insured <- as.numeric(insurance_educ$Private_Insured)

insurance_educ$Public_Insured <- as.numeric(insurance_educ$Public_Insured)

insurance_educ$Uninsured <- as.numeric(insurance_educ$Uninsured)

Total_Pop <- cbind(insurance_educ, Total = rowSums(insurance_educ[, 3:5]))

insurance_educ <- insurance_educ %>%
  mutate(Total_Pop)

insurance_educ$Uninsured <- (insurance_educ$Uninsured/insurance_educ$Total) * 100

insurance_educ$Private_Insured <- (insurance_educ$Private_Insured/insurance_educ$Total) * 100

insurance_educ$Public_Insured <- (insurance_educ$Public_Insured/insurance_educ$Total) * 100

insurance_educ <- insurance_educ %>%
  select(-Insured_Total, -Total)

insurance_educ <- pivot_longer(insurance_educ, c(Uninsured, Private_Insured, Public_Insured ), "Insurance_Status")

##Medicare Applications based on Age Cohorts

#df for Medicare Applications based on Age Cohorts

medicare_age <- read_excel("socsec1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
medicare_age <- dplyr::rename(medicare_age, Total = ...2, Percent_Applied = ...3, Percent_No_Application = ...4, Age = "Social Security Table 1. Applications for Social Security, Supplemental Security Income, or Medicare Benefits during the Coronavirus Pandemic, by Select Characteristics: United States")

medicare_age <- medicare_age[9:13, ] %>%
  select( -...5) 

medicare_age$Total <- as.numeric(medicare_age$Total)

medicare_age$Percent_Applied  <- as.numeric(medicare_age$Percent_Applied)

medicare_age$Percent_No_Application <- as.numeric(medicare_age$Percent_No_Application)


medicare_age$Percent_Applied <- (medicare_age$Percent_Applied/medicare_age$Total) * 100

medicare_age <- medicare_age %>%
  select(-Total, -Percent_No_Application)

##Medicare Applications based on Age Cohorts

#df for Medicare Applications based on Age Cohorts

medicare_income <- read_excel("socsec1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
medicare_income <- dplyr::rename(medicare_income, Total = ...2, Percent_Applied = ...3, Percent_No_Application = ...4, Household_Income = "Social Security Table 1. Applications for Social Security, Supplemental Security Income, or Medicare Benefits during the Coronavirus Pandemic, by Select Characteristics: United States")

medicare_income <- medicare_income[53:60, ] %>%
  select( -...5) 

medicare_income$Total <- as.numeric(medicare_income$Total)

medicare_income$Percent_Applied  <- as.numeric(medicare_income$Percent_Applied)

medicare_income$Percent_No_Application <- as.numeric(medicare_income$Percent_No_Application)

medicare_income$Percent_Applied <- (medicare_income$Percent_Applied/medicare_income$Total) * 100

medicare_income <- medicare_income %>%
  select(-Total, -Percent_No_Application)

##Medicare Applications based on Family Size

#df for Medicare Applications based on Family Size

medicare_fs <- read_excel("socsec1_week18.xlsx", col_names = TRUE)
## New names:
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
medicare_fs <- dplyr::rename(medicare_fs, Total = ...2, Percent_Applied = ...3, Percent_No_Application = ...4, Family_Size = "Social Security Table 1. Applications for Social Security, Supplemental Security Income, or Medicare Benefits during the Coronavirus Pandemic, by Select Characteristics: United States")

medicare_fs <- medicare_fs[35:41, ] %>%
  select( -...5) 

medicare_fs$Total <- as.numeric(medicare_fs$Total)

medicare_fs$Percent_Applied  <- as.numeric(medicare_fs$Percent_Applied)

medicare_fs$Percent_No_Application <- as.numeric(medicare_fs$Percent_No_Application)


medicare_fs$Percent_Applied <- (medicare_fs$Percent_Applied/medicare_fs$Total) * 100

medicare_fs <- medicare_fs %>%
  select(-Total, -Percent_No_Application)

Results

Describe your results and include relevant tables, plots, and code/comments used to obtain them. End with a brief conclusion of your findings related to the question you set out to address. You can include references if you’d like, but this is not required.

##Global Cases Interactive Map

#Global Cases Interactive Map

pal_fun <- colorBin(palette = brewer.pal(9, "RdBu")[c(1:5, 7)], 
                    bins = c(0, 10000, 17467, 25000, 50984, 70000, 90000, 1000000, 3000000, 23821196), reverse = TRUE,
                    NULL)

message <- paste0(Global_Cases_map$country,   
                     "<br>Total Cases by Country: ",
                  Global_Cases_map$total_cases)

leaflet(st_as_sf(Global_Cases_map)) %>%
   addPolygons(stroke = FALSE,                        
               fillColor = ~pal_fun(total_cases),
               fillOpacity = 0.5, smoothFactor = 0.5, 
               popup = message) %>%
   addProviderTiles(providers$CartoDB.Positron) %>%   
   addLegend("bottomright",                           
             pal=pal_fun,                             
             values=~total_cases,               
             title = 'Total Global Cases',                  
             opacity = 1) %>%                         
   addScaleBar()

##U.S. Cases with Respect the Date Bar Graph

#U.S. Cases with Respect the Date
ggplot(data = summary_df, aes(x = date, y = cases)) +
      geom_bar(stat = "identity", fill = "red") +
      labs(title = "US Confirmed Cases vs. Date",
           x = "Date", y = "New Reported Cases in the U.S.") +
      geom_smooth(color = "black", method = "gam") +
      theme_cowplot(12)
## `geom_smooth()` using formula 'y ~ s(x, bs = "cs")'

##U.S. Cases by County Interactive Map

#U.S. Cases by County

pal_fun <- colorBin(palette = brewer.pal(9, "RdBu")[c(1:5, 7)], 
                    bins = c(0, 10000, 17467, 35000, 52984, 100000, 157300, 1000000, 3000000, 23821196), reverse = TRUE,
                    NULL)

message <- paste0(Cases$county,", ", Cases$state,   
                     "<br>Total Cases by County: ",
                  Cases$total_cases)

leaflet(st_as_sf(Cases)) %>%
   addPolygons(stroke = FALSE,                        
               fillColor = ~pal_fun(total_cases),
               fillOpacity = 0.5, smoothFactor = 0.5, 
               popup = message) %>%
   addProviderTiles(providers$CartoDB.Positron) %>%   
   addLegend("bottomright",                           
             pal=pal_fun,                             
             values=~total_cases,               
             title = 'Total Cases',                  
             opacity = 1) %>%                         
   addScaleBar()

##U.S. Deaths by County Interactive Map

#U.S. Deaths by County

pal_fun <- colorBin(palette = brewer.pal(9, "RdBu")[c(1:5, 7)], 
                    bins = c(0, 300, 1000, 3000, 6000, 15000, 300000), reverse = TRUE,
                    NULL)

message <- paste0(Cases$county,", ", Deaths$state,   
                     "<br>Total Deaths by County: ",
                  Deaths$total_deaths)

leaflet(st_as_sf(Deaths)) %>%
   addPolygons(stroke = FALSE,                        
               fillColor = ~pal_fun(total_deaths),
               fillOpacity = 0.5, smoothFactor = 0.5, 
               popup = message) %>%
   addProviderTiles(providers$CartoDB.Positron) %>%   
   addLegend("bottomright",                           
             pal=pal_fun,                             
             values=~total_deaths,               
             title = 'Total Deaths',                  
             opacity = 1) %>%                         
   addScaleBar()

##U.S. Cases per Capita by County Interactive Map

#U.S. Cases per Capita by County

pal_fun <- colorBin(palette = brewer.pal(9, "RdBu")[c(1:5, 7)], 
                    bins = c(0, 2.5, 5, 7, 15, 100), reverse = TRUE,
                    NULL)

message <- paste0(per_capita_data$county,", ", per_capita_data$state,   
                     "<br>Cases by Per Capita: ",
                  per_capita_data$per_capita)

leaflet(st_as_sf(per_capita_data)) %>%
   addPolygons(stroke = FALSE,                        
               fillColor = ~pal_fun(per_capita),
               fillOpacity = 0.5, smoothFactor = 0.5, 
               popup = message) %>%
   addProviderTiles(providers$CartoDB.Positron) %>%   
   addLegend("bottomright",                           
             pal=pal_fun,                             
             values=~per_capita,               
             title = 'Cases by Per Capita',                  
             opacity = 1) %>%                         
   addScaleBar()

##COVID-19 Deaths based on Gender and Age Bar Graph

#COVID-19 Deaths based on Gender and Age

ggplot(data = covid_sex_age_deaths, aes(x = Age_group, fill = factor(Sex))) + 
    geom_bar(position = "dodge") +
    ggtitle("COVID-19 Deaths based on Gender and Age") +
    xlab("Age Cohort") +
    ylab("Current Total Deaths") +
    coord_flip() +
    theme_cowplot(12)

    facet_grid(. ~ Sex) 
## <ggproto object: Class FacetGrid, Facet, gg>
##     compute_layout: function
##     draw_back: function
##     draw_front: function
##     draw_labels: function
##     draw_panels: function
##     finish_data: function
##     init_scales: function
##     map_data: function
##     params: list
##     setup_data: function
##     setup_params: function
##     shrink: TRUE
##     train_scales: function
##     vars: function
##     super:  <ggproto object: Class FacetGrid, Facet, gg>
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) 
## List of 1
##  $ axis.text.x:List of 11
##   ..$ family       : NULL
##   ..$ face         : NULL
##   ..$ colour       : NULL
##   ..$ size         : NULL
##   ..$ hjust        : num 1
##   ..$ vjust        : NULL
##   ..$ angle        : num 45
##   ..$ lineheight   : NULL
##   ..$ margin       : NULL
##   ..$ debug        : NULL
##   ..$ inherit.blank: logi FALSE
##   ..- attr(*, "class")= chr [1:2] "element_text" "element"
##  - attr(*, "class")= chr [1:2] "theme" "gg"
##  - attr(*, "complete")= logi FALSE
##  - attr(*, "validate")= logi TRUE

##COVID-19 Deaths in relation to Gender Bar Graph

#COVID-19 Deaths in relation to Gender

ggplot(data = gender_graphs, aes(x = Sex, y = COVID_Deaths, fill = Sex)) +
    geom_col() +
    ggtitle("Gender and Deaths") +
    theme_cowplot(12) +
    xlab("Sex") +
    ylab("COVID-19 Total Deaths")

##COVID-19 Deaths per 100,000 People Bar Graph

#COVID-19 Deaths per 100,000 People

ggplot(data = race, aes(x = Race, y = Actual_Deaths, fill = Race)) +
    geom_col() +
    ggtitle("COVID-19 Deaths per 100,000 People") +
    theme_cowplot(12) +
    xlab("Race") +
    ylab("COVID-19 Total Deaths")

##COVID-19 Deaths per 100,000 People Age-Adjusted Bar Graph

#COVID-19 Deaths per 100,000 People Age-Adjusted

ggplot(data = race, aes(x = Race, y = Age_Adjusted, fill = Race)) +
    geom_col() +
    ggtitle("COVID-19 Deaths per 100,000 People Age-Adjusted") +
    theme_cowplot(12) +
    xlab("Race") +
    ylab("COVID-19 Total Deaths")

##Food Insufficiency #Food Insufficiency During Pandemic by Race Bar Graph

#Food Insufficiency During Pandemic by Race

ggplot(data = food_insufficiency_race, aes(x = Race, y = percent_difficulty_eating, fill = Race)) +
    geom_col() +
    coord_flip() +
    ggtitle("Food Insufficiency During Pandemic by Race") +
    theme_cowplot(10) +
    xlab("Race") +
    ylab("% Population with Difficulty Eating")

#Food Insufficiency During Pandemic by Household Income Bar Graph

#Food Insufficiency During Pandemic by Household Income

ggplot(data = food_insufficiency_income, aes(x = Household_Income, y = percent_difficulty_eating, fill = Household_Income)) +
    geom_col() +
    coord_flip() +
    ggtitle("Food Insufficiency During Pandemic by Household Income") +
    xlab("Household Income") +
    ylab("% Population with Difficulty Eating") +
    theme_cowplot(12)

##Difficulties Paying Rent #Difficulties Paying Rent by Race Bar Graph

#Difficulties Paying Rent by Race

ggplot(data = rent_race, aes(x = Race, y = percent_difficulty_rent, fill = Race)) +
    geom_col() +
    coord_flip() +
    ggtitle("Difficulties Paying Rent by Race During Pandemic") +
    theme_cowplot(10) +
    xlab("Race") +
    ylab("% Population with Difficulty Paying Rent")

#Difficulties Paying Rent by Household Income Bar Graph

#Difficulties Paying Rent by Household Income

ggplot(data = rent_income, aes(x = Household_Income, y = percent_difficulty_rent, fill = Household_Income)) +
    geom_col() +
    coord_flip() +
    ggtitle("Difficulties Paying Rent by Household Income with Pandemic") +
    theme_cowplot(12) +
    xlab("Household Income") +
    ylab("% Population with Difficulty Paying Rent")

#Difficulties Paying Rent by Education Strata Bar Graph

#Difficulties Paying Rent by Education Strata

ggplot(data = rent_education, aes(x = Education_Status, y = percent_difficulty_rent, fill = Education_Status)) +
    geom_col() +
    coord_flip() +
    ggtitle("Difficulties Paying Rent by Education Strata") +
    theme_cowplot(12) +
    xlab("Education Status") +
    ylab("% Population with Difficulty Paying Rent During Pandemic")

##Difficulties Paying Expenses #Difficulties Paying Expenses by Race Bar Graph

#Difficulties Paying Expenses by Race

ggplot(data = expenses_race, aes(x = Race, y = percent_difficulty_expenses, fill = Race)) +
    geom_col() +
    coord_flip() +
    ggtitle("Difficulties with Paying Expenses by Race During Pandemic") +
    theme_cowplot(10) +
    xlab("Race") +
    ylab("% Population with Difficulty Paying Expenses")

#Difficulties Paying Expenses by Household Income Bar Graph

#Difficulties Paying Expenses by Household Income

ggplot(data = expenses_income, aes(x = Household_Income, y = percent_difficulty_expenses, fill = Household_Income)) +
    geom_col() +
    coord_flip() +
    ggtitle("Difficulties with Paying Expenses by Household Income") +
    theme_cowplot(12) +
    xlab("Household Income") +
    ylab("% Population with Difficulty Paying Expenses During Pandemic")

#Difficulties Paying Expenses by Education Status Bar Graph

#Difficulties Paying Expenses by Education Status

ggplot(data = expenses_education, aes(x = Education_Status, y = percent_difficulty_expenses, fill = Education_Status)) +
    geom_col() +
    coord_flip() +
    ggtitle("Difficulties Paying Expenses by Education Status") +
    theme_cowplot(12) +
    xlab("Education Status") +
    ylab("% Population with Difficulty Paying Expenses During Pandemic")

##Percent Unemployment #Percent Unemployment by Household Income Bar Graph

#Percent Unemployment by Household Income

ggplot(data = employment_income, aes(x = Household_Income, y = Percent_Unemployment, fill = Household_Income)) +
    geom_col() +
    coord_flip() +
    ggtitle("Percent Share of Unemployment by Household Income") +
    theme_cowplot(12) +
    xlab("Household Income") +
    ylab("% Population Unemployed During Pandemic")

#Percent Unemployment by Education Status Bar Graph

#Percent Unemployment by Education Status

ggplot(data = employment_education, aes(x = Education_Status, y = Percent_Unemployment, fill = Education_Status)) +
    geom_col() +
    coord_flip() +
    ggtitle("Share of Unemployment by Education Status") +
    theme_cowplot(12) +
    xlab("Education Status") +
    ylab("% Population Unemployed During Pandemic")

##Health Status by Occupation Type Bar Graph

#Health Status by Occupation Type 

ggplot(data = employment_health, aes(x = Health_Status, y = value, fill = Health_Status)) +
    geom_col() + 
    scale_x_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Health Status by Occupation Type During Pandemic") +
    facet_wrap(~Occupation) +
    coord_flip() +
    theme_cowplot(12) +
    xlab("Health Status") +
    ylab("% of Total Population")

##Medical Care Access #Medical Care Access by Race Bar Graph

#Medical Care Access by Race

ggplot(data = access_race, aes(x = Race, y = value, fill = Race)) +
    geom_col() + 
    scale_x_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Medical Care Access by Race During Pandemic") +
    facet_wrap(~Access_to_Care, ncol = 1, scales="free_y") +
    theme_cowplot(12) +
    xlab("Race") +
    ylab("Total Individuals within Sample")

#Medical Care Access by Household Income Bar Graph

#Medical Care Access by Household Income

ggplot(data = access_income, aes(x = Household_Income, y = value, fill = Access_to_Care)) + 
    geom_col(position = "dodge") +
    ggtitle("COVID-19 Deaths based on Gender and Age") +
    coord_flip() +
    theme_cowplot(12) +
    facet_wrap(~ Access_to_Care, ncol =1) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    ggtitle("Issues with Access to Medical Care by Household Income") +
    xlab("Household Income") +
    ylab("Total Population within Sample")

#Medical Care Access by Education Status Bar Graph

#Medical Care Access by Education Status

ggplot(data = access_educ, aes(x = Education_Status, y = value, fill = Education_Status)) +
    geom_col() + 
    coord_flip() +
    scale_y_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Medical Care Access Issues by Education Status") +
    facet_wrap(~Access_to_Care, scales="free_y", ncol =1) +
    theme_cowplot(12) +
    xlab("Education Status") +
    ylab("Total Population within Sample")

#Medical Care Access by Age Cohort Bar Graph

#Medical Care Access by Age Cohort

ggplot(data = access_age, aes(x = Age_Group, y = value, fill = Age_Group)) +
    geom_col() + 
    coord_flip() +
    scale_y_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Medical Care Access Issues by Age Cohort") +
    facet_wrap(~Access_to_Care, scales="free_y", ncol =1) +
    theme_cowplot(12) +
    xlab("Age Cohort") +
    ylab("Total Population within Sample")

##Insurance Status #Insurance Status by Race Bar Graph

#Insurance Status by Race

ggplot(data = insurance_race, aes(x = Race, y = value, fill = Race)) +
    geom_col() + 
    scale_x_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Insurance Status by Race During Pandemic") +
    facet_wrap(~Insurance_Status, ncol = 1, scales="free_y") +
    theme_cowplot(12) +
    xlab("Race") +
    ylab("% of Population Sampled")

#Insurance Status by Household Income Bar Graph

#Insurance Status by Household Income

ggplot(data = insurance_income, aes(x = Household_Income, y = value, fill = Household_Income)) +
    geom_col() + 
    scale_x_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Insurance Status by Household Income") +
    facet_wrap(~Insurance_Status, ncol = 1, scales="free_y") +
    theme_cowplot(12) +
    xlab("Household Income") +
    ylab("% of Population Sampled During Pandemic")

#Insurance Status by Education Strata Bar Graph

#Insurance Status by Education Strata

ggplot(data = insurance_educ, aes(x = Education_Status, y = value, fill = Education_Status)) +
    geom_col() + 
    scale_x_discrete(guide = guide_axis(check.overlap = TRUE)) +
    ggtitle("Insurance Status by Education Strata") +
    facet_wrap(~Insurance_Status, ncol = 1, scales="free_y") +
    theme_cowplot(12) +
    xlab("Education Status") +
    ylab("% of Population Sampled During Pandemic")

##Medicare Applications #Medicare Applications by Age Cohort Bar Graph

#Medicare Applications by Age Cohort

ggplot(data = medicare_age, aes(x = Age, y = Percent_Applied, fill = Age)) +
    geom_col( position = "dodge") +
    theme_cowplot(12) +
    ggtitle("Recent Medicare Applications by Age During Pandemic") +
    xlab("Age Cohort") +
    ylab("% of Population with Applications")

#Medicare Applications by Household Income Bar Graph

#Medicare Applications by Household Income

ggplot(data = medicare_income, aes(x = Household_Income, y = Percent_Applied, fill = Household_Income)) +
    geom_col(position = "dodge") +
    coord_flip() +
    theme_cowplot(12) +
    ggtitle("Recent Medicare Applications by Household Income") +
    xlab("Household Income") +
    ylab("% of Population with Applications During Pandemic")

#Medicare Applications by Family Size Bar Graph

#Medicare Applications by Family Size

ggplot(data = medicare_fs, aes(x = Family_Size, y = Percent_Applied, fill = Family_Size)) +
    geom_col(position = "dodge") +
    coord_flip() +
    theme_cowplot(12) +
    ggtitle("Recent Medicare Applications by Family Size") +
    xlab("Family Size") +
    ylab("% of Population with Applications During Pandemic")

###Conclusion

In the first portion of the project, the report examined COVID-19 cases on a global scale, and it seems that in comparison to other countries, the U.S. tops the list with the highest number of cases. As the report further examined the U.S. specifically, it is evident that the cases and deaths seem to be rising continuously without any sign of a plateau within the last couple of months. Furthermore, it is clear that states in the northeast, south, and west coast particularly show the highest volume of cases among other states from other regions. Upon proper analysis of COVID-19 cases per capita, the report portrays an even distribution throughout the country showing that controlling for population yields variable results.
As COVID-19 takes a toll on the american population, it is only natural that individuals question which segments of the population are not only asymmetrically at risk for contracting the virus but also most likely to lack proper access to care and resources necessary for prevention and treatment. Based on recent analysis, it has been shown that older individuals are more likely to die from the virus while gender does not seem to be related to death or contraction of the virus. In addition, race was also observed to be a predictor of COVID-19 deaths as minority populations from the African American, Indigenous, and Latino communities showed the highest proportion of deaths when population and age was controlled for. 
A variety of factors that influence the results above are mentioned as associations that may influence such pathways, but it is not clear that these specific mechanisms cause such higher rates of death alone. For example, it is clear that individuals who are below the poverty line, uneducated, and belonging to minority populations are suffering from environmental stressors such as food insufficiency, inability to pay expenses, and access to proper medical care. A surprising finding that is contrary to the pre-study analysis is that unemployment is clearly affecting individuals of the population who earn more than $200,000 with higher degrees of education as compared to uneducated, lower-income individuals. Still, individuals who are belonging to minority populations and below the poverty threshold are more likely to be uninsured or have public insurance. Future policies need to be enacted that consider such variabilities in the population. Because of the inherent limitations on the datasets, a chi-square analysis could not be performed that outputs values to show significance. As a result, other studies need to explore such an area and find or create datasets that have such attributes. Finally, the results of the report show the demand for further inquiries that need to be made to investigate the causal pathways that depict how population subgroups are at more risk for COVID-19 susceptibility and death as this report shows a clear relationship between minority marginalization within the scope of the healthcare domain.